README for "MAD: A Multimodal Anomaly Detection Framework..."
This repository contains the official PyTorch implementation for the paper "MAD: A Multimodal Anomaly Detection Framework Based on Shared Transformer and Contrastive Learning for Smart Manufacturing". The code is organized to allow for the complete reproduction of our main experimental results, following a three-step pipeline: data preprocessing, model pre-training with Supervised Contrastive (SupCon) loss, and final fine-tuning with Cross-Entropy (CE) loss.

1. Setup
To begin, we recommend setting up a dedicated Python 3.8+ virtual environment to manage dependencies. Once the environment is activated, all required libraries can be installed by running pip install -r requirements.txt in your terminal. This process will prepare your system to execute the reproduction scripts.
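As a quick sanity check before installing dependencies, the interpreter version can be verified from Python itself. This is a minimal sketch; the helper name meets_minimum_version is illustrative and is not part of the repository.

```python
import sys
from typing import Sequence, Tuple

# Minimum interpreter version recommended in the Setup section.
MIN_VERSION: Tuple[int, int] = (3, 8)


def meets_minimum_version(version: Sequence[int] = sys.version_info) -> bool:
    """Return True if `version` is at least MIN_VERSION (hypothetical helper)."""
    return tuple(version[:2]) >= MIN_VERSION


if not meets_minimum_version():
    print("Warning: Python %d.%d+ is recommended." % MIN_VERSION)
```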

2. Code Structure
The main logic is organized into several scripts and modules within the src directory. The src/scripts/ folder contains the primary executable files: preprocess_data.py for converting raw data into tensors, pretrain_model.py for the SupCon pre-training stage, finetune_model.py for the final CE fine-tuning, and inference.py to run predictions with a trained model. The core definitions for the model architecture and the custom dataset class are located in src/models.py and src/datasets.py, respectively.

3. How to Reproduce Main Results
Note on the Dataset: For the convenience of reviewers and to ensure quick reproducibility, this supplementary material includes a 1/100th downsampled version of the raw dataset, which is located in the data/raw directory. The following instructions can be run directly on this provided sample.

The reproduction process begins with data preparation. First, ensure the sample data is organized into class-specific subdirectories inside data/raw. Then run the preprocessing script, which converts the raw images and signals into tensor patches suitable for the model and populates the data/processed directory:

    python -m src.scripts.preprocess_data --data_dir data/raw --output_dir data/processed
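Before running the preprocessing step, it can help to confirm that data/raw really follows the class-per-subdirectory layout described above. The following is a minimal sketch under that assumption only; count_files_per_class is a hypothetical helper, not a function from the repository, and it makes no assumption about file names or extensions.

```python
from pathlib import Path
from typing import Dict


def count_files_per_class(raw_dir: str) -> Dict[str, int]:
    """Map each class subdirectory of `raw_dir` to its number of files.

    Assumes the layout <raw_dir>/<class_name>/...; files may be nested
    further below each class directory. Files placed directly in
    `raw_dir` are ignored.
    """
    root = Path(raw_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Raw data directory not found: {root}")

    counts: Dict[str, int] = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Count regular files at any depth below the class directory.
        counts[class_dir.name] = sum(1 for f in class_dir.rglob("*") if f.is_file())
    return counts
```

Calling count_files_per_class("data/raw") returns one entry per class directory. An empty result or a class with zero files suggests the sample data is not where the preprocessing script expects it.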

With the data prepared, you can proceed to the first stage of our two-stage training strategy: pre-training. This step trains the MAD model with the SupCon loss to learn a structured representation space:

    python -m src.scripts.pretrain_model --data_dir data/processed --output_dir pretrained_weights

The best-performing model weights from this stage are saved as model_best_pretrained.pth.tar inside the pretrained_weights folder.
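To make the pre-training objective concrete, here is a small reference sketch of a supervised contrastive loss in the commonly used formulation: embeddings are L2-normalized, and each anchor is pulled toward samples with the same label and pushed away from all others. It is written in plain Python for readability and is not the repository's PyTorch implementation. The function name supcon_loss and the temperature of 0.1 are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def supcon_loss(embeddings: Sequence[Sequence[float]],
                labels: Sequence[int],
                temperature: float = 0.1) -> float:
    """Supervised contrastive loss averaged over anchors that have positives.

    Illustrative only: `temperature` is an assumed value, not taken
    from the paper or the repository configuration.
    """
    # L2-normalize so that dot products are cosine similarities.
    normed: List[List[float]] = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0.0:
            raise ValueError("Embeddings must be non-zero vectors.")
        normed.append([x / norm for x in vec])

    n = len(normed)
    total, num_anchors = 0.0, 0
    for i in range(n):
        # Positives: other samples sharing the anchor's label.
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Scaled similarity of the anchor to every other sample.
        sims = {a: sum(x * y for x, y in zip(normed[i], normed[a])) / temperature
                for a in range(n) if a != i}
        # Log-sum-exp over all non-anchor samples, computed stably.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(sum(math.exp(s - max_sim) for s in sims.values()))
        # Negative mean log-probability of the positives.
        total -= sum(sims[p] - log_denom for p in positives) / len(positives)
        num_anchors += 1
    return total / num_anchors if num_anchors else 0.0
```

With this definition, a batch whose same-label embeddings already point in similar directions yields a loss close to zero, while a batch whose labels cut across the clusters yields a much larger value.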

Finally, the fine-tuning stage uses these pre-trained weights to train the final classifier with the CE loss:

    python -m src.scripts.finetune_model --data_dir data/processed --pretrained_path pretrained_weights/model_best_pretrained.pth.tar --output_dir results/final_model

This script loads the pre-trained weights, fine-tunes the model, and saves the final results and logs to the results/final_model directory.
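The three commands in this section can also be chained from a small driver script so that a failure in one stage stops the run. The sketch below uses only the module names, flags, and paths given above; the function names build_pipeline_commands and run_pipeline are hypothetical and do not exist in the repository.

```python
import subprocess
import sys
from typing import List

# Checkpoint written by the pre-training stage and read by fine-tuning.
PRETRAINED = "pretrained_weights/model_best_pretrained.pth.tar"


def build_pipeline_commands(python: str = sys.executable) -> List[List[str]]:
    """Return the three reproduction commands in execution order."""
    return [
        [python, "-m", "src.scripts.preprocess_data",
         "--data_dir", "data/raw", "--output_dir", "data/processed"],
        [python, "-m", "src.scripts.pretrain_model",
         "--data_dir", "data/processed", "--output_dir", "pretrained_weights"],
        [python, "-m", "src.scripts.finetune_model",
         "--data_dir", "data/processed", "--pretrained_path", PRETRAINED,
         "--output_dir", "results/final_model"],
    ]


def run_pipeline() -> None:
    """Run each stage in turn and stop at the first failure."""
    for command in build_pipeline_commands():
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError if a stage exits non-zero.
        subprocess.run(command, check=True)
```

Calling run_pipeline() from the repository root, inside the activated environment, executes preprocessing, pre-training, and fine-tuning in sequence.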

4. How to Use the Trained Model (Inference)
After the fine-tuning script has completed, you can use the final model to make a prediction on a sample from the dataset. The inference.py script loads the best fine-tuned model and its corresponding configuration, then classifies a single, randomly selected sample. Run the following command, pointing to the final model weights and the configuration file saved in the previous step:

    python -m src.scripts.inference --weights_path results/final_model/model_best_finetuned.pth.tar --config_path results/final_model/config_finetune_used.yaml --data_dir data/processed

The script outputs the true label of the sample, the model's prediction, and the prediction's confidence score.
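For readers wondering how a prediction and a confidence score can be derived from classifier outputs, one common convention is to take the class with the highest softmax probability and report that probability as the confidence. This README does not specify how inference.py computes its score, so the sketch below is an assumption for illustration, and predict_with_confidence is a hypothetical name.

```python
import math
from typing import List, Tuple


def predict_with_confidence(logits: List[float]) -> Tuple[int, float]:
    """Return (predicted class index, softmax probability of that class).

    Assumes confidence is the maximum softmax probability; the actual
    script may define it differently.
    """
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the largest probability (first one wins on ties).
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return predicted, probs[predicted]
```

For example, predict_with_confidence([2.0, 0.5, -1.0]) selects class 0 with a confidence of roughly 0.79.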


